ABSTRACT

The single instruction stream, multiple
data stream MPP processor consists of
16,384 bit-serial arithmetic processors
configured as a 128 x 128 array whose
speed can exceed that of current
supercomputers (Cyber 205). This paper
presents and discusses the
applicability of the MPP for solving
reaction network problems, including
the mapping of the calculation to the
architecture and CPU timing
comparisons.

INTRODUCTION

A detailed model which describes the
transport and removal of photochemical
oxidants, acidic species, and
precursors in the troposphere has been
under development for the past nine
years. The present analysis consists
of about 30 coupled three-dimensional
time-dependent non-linear partial
differential equations and about 50-100
coupled non-linear ordinary
differential equations.

The model is representative of a number
of comprehensive Eulerian
transport/chemistry models being
developed for regional air pollution
problems. However, these models are
only feasible when run on
"supercomputers". Our model was
developed on the NASA-Langley CDC-STAR
computer and is currently being
exercised on the NASA-Langley Cyber
205, the NCAR CRAY-1, and a FAC M240 at
Nagoya University, Japan. The
execution times are 0.025 CPU-sec per
grid point per time step on the FAC
M240, 0.007 CPU-sec per grid point per
time step on the CRAY-1, and 0.70
CPU-sec per grid point per time step on
the VAX 11/780. Thus, a 24-hour
simulation on the CRAY-1 for the
eastern United States with 9500 grid
points requires 100 CPU-minutes.

Our experience has shown that
transport/chemistry models can execute
about 70-100 times faster on the
"supercomputers". However, 100
CPU-minutes per simulation-day is still
too large for most applications, since
typical applications require
simulations of seven to ten days.
Therefore, to exercise these models,
various simplifying assumptions are
used to decrease the CPU time.
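
The CPU cost of a multi-day run can be
checked with a short calculation. The
sketch below multiplies the quoted
CRAY-1 cost per grid point per time
step by the number of grid points and
time steps. The 15-minute time step is
an assumption made here for
illustration, and the function name is
hypothetical.

```python
def cpu_minutes(cost_per_grid_step, n_grid, sim_hours, step_minutes):
    """Estimate CPU minutes from a cost in CPU-sec per grid point per step."""
    n_steps = sim_hours * 60 / step_minutes
    return cost_per_grid_step * n_grid * n_steps / 60.0


# CRAY-1 figure from the text: 0.007 CPU-sec per grid point per time step,
# 9500 grid points. The 15-minute step is an assumed value.
cray_day = cpu_minutes(0.007, 9500, sim_hours=24, step_minutes=15)
print("one simulated day:", round(cray_day, 1), "CPU-minutes")

for days in (7, 10):
    print(days, "simulated days:", round(days * cray_day), "CPU-minutes")
```

Under that assumption one simulated day
costs roughly 100 CPU-minutes, so a
seven- to ten-day episode costs on the
order of 700 to 1000 CPU-minutes.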

However, these assumptions introduce
additional errors and uncertainties
into the model results. Faster
computers will enable the execution of
the "best-science" model version.

Currently about 90% of the CPU time is
spent doing the chemistry
calculations. The chemistry introduces
the stiffness, the coupling, and the
non-linearity into the model. Thus,
the highest priority in continued model
development is to search for ways in
software and hardware to reduce the
cost of the chemistry calculation. The
purpose of this paper is to describe
our attempts to exploit massively
parallel computer architectures to
accelerate the chemistry calculations.

MODEL OVERVIEW 

The regional-scale combined
transport/chemistry/deposition model is
Eulerian and treats 50 chemical
species. Thirty species are advected,
while the remaining species are
short-lived and are modeled using
pseudo-steady-state methods. The
mathematical analysis consists of
partial differential equations for the
advected species and additional
algebraic equations for the
steady-state species. The advected
species satisfy

$$
\frac{\partial C_i}{\partial t} + \nabla \cdot (\mathbf{V} C_i) = \nabla \cdot \mathbf{K} \cdot \nabla C_i + R_i + S_i - G_i, \qquad i = 1, \ldots, 30, \qquad (1)
$$

where $C_i$ is the gas-phase
concentration of the ith chemical
species, $\mathbf{V}$ is the wind
velocity vector, $\mathbf{K}$ is the
eddy diffusivity tensor, $R_i$ denotes
the chemical reaction term, $S_i$ is
the source term, and $G_i$ is used to
describe the mass transfer between the
gas and condensed phases. The algebraic
equations for the gas-phase species
assumed to be at steady state are
written as


Simulation of regional transport,
chemistry, and deposition as described
by Equations (1) and (2) requires
numerical integration. The method
presently used is a combination of the
concept of fractional time steps and
one-dimensional finite elements. This
is referred to as the Locally
One-Dimensional, Finite-Element Method
(LOD-FEM). The LOD procedures
(Mitchell, 1969) split the
multi-dimensional partial differential
equation into time-dependent,
one-dimensional problems which are
solved sequentially. The transport
equations are solved using a
Crank-Nicolson Galerkin finite element
technique. Chemistry and mass transfer
equations are solved using an
adaptation of the semi-implicit Euler
method proposed by Preussner and Brand
(1981).
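
A minimal sketch of a semi-implicit
Euler update may help fix ideas. It
assumes that each reaction term can be
written in production/loss form,
$R_i = P_i - L_i C_i$, with $P_i$ and
$L_i$ evaluated at the old
concentrations. That form, the
two-species mechanism, the rate
constants, and all function names are
illustrative assumptions, not the
adaptation used in the model.

```python
def semi_implicit_euler_step(c, production, loss, dt):
    """Advance dC_i/dt = P_i - L_i*C_i by one step of size dt.

    P_i and L_i are evaluated at the old concentrations and the loss
    term is treated implicitly:
        C_i_new = (C_i + dt*P_i) / (1 + dt*L_i)
    The update stays non-negative for non-negative C_i, P_i and L_i.
    """
    p = production(c)
    l = loss(c)
    return [(ci + dt * pi) / (1.0 + dt * li) for ci, pi, li in zip(c, p, l)]


# Hypothetical mechanism A <-> B (rate constants are made-up values).
K1 = 50.0  # A -> B
K2 = 1.0   # B -> A


def production(c):
    a, b = c
    return [K2 * b, K1 * a]


def loss(c):
    return [K1, K2]


def integrate(c0, dt, n_steps):
    c = list(c0)
    for _ in range(n_steps):
        c = semi_implicit_euler_step(c, production, loss, dt)
    return c


# After operator splitting the chemistry at each grid point is independent,
# so the same routine is applied point by point. Here dt*K1 = 5, a step
# size at which explicit Euler would be unstable for this system.
grid = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
result = [integrate(c0, dt=0.1, n_steps=200) for c0 in grid]
```

Each grid point relaxes to the
equilibrium ratio K1*A = K2*B. Note
that this simple form of the scheme
does not conserve total mass exactly.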

SCOPING STUDIES 

Test Problem 1: Chemical Network Problem

To evaluate the ability of the MPP to
calculate chemical network
applications, a simple four-species
test problem was selected. The four
species ($C_1$, $C_2$, $C_3$, $C_4$)
are involved in the following chemical
reactions:


C 1 + C 2 



(a) 



(b) 




' C 50> = °' 


= 31, 


These equations are representations of 
general chemically reactive flow 
problems. 


The transport equations describing this
system are represented by Eq. (1) with
i = 1, 2, 3, and 4.

As mentioned in the model overview
section, one way of numerically solving
complex transport/chemistry network
problems is to split the equation into
transport and chemistry parts. The
chemistry calculation using this
technique requires solving the set of
equations

$$
\frac{\partial C_i}{\partial t} = R_i, \qquad i = 1, \ldots, \text{number of species}, \qquad (5)
$$

at each grid point in the discretized
space.

The use of the semi-implicit Euler
method to solve Eq. (5) results in the
equations

$$
\frac{dC_i}{dt} = \ldots, \qquad i = 1, \ldots, 4. \qquad (6)
$$

This set of ODE-IVPs is solved within
each transport time step, i.e.,

$$
t_0 < t < t_{\text{transport}}.
$$

Now consider the case when we have
16,384 grid points in the discretized
spatial domain. Each chemical
calculation within each transport step
then requires the solution of 16,384
sets of Eq. (6). Implementing the
solution of these equations on the MPP
first requires choosing how to map the
equations to the architecture. In this
case we have chosen to simply view each
processor as a grid point in the
discretized space, and to have each
processor solve its own set of Eq. (6).
The algorithm for the solution of
Eq. (6) is written in Parallel Pascal
and resides on the VAX. The initial
conditions and constants are
distributed to each processor, the
algorithm is executed on the MPP, and
the output is sent back to the VAX.

The CPU time required for execution of
100 time steps on the MPP of this
4-species mechanism at 16,384 grid
points is 0.293 CPU-seconds. The same
problem was executed on the VAX 11/780
and required 138 CPU-seconds. Thus, for
this chemical network problem the MPP
executed a factor of 470 times faster
than the VAX 11/780.

Test Problem 2: NOx Transport in
Eastern United States

To test combined transport/chemistry
network problems on the MPP, a
3-dimensional test problem describing
the transport and chemistry of NO, NO2,
O3, and HNO3 in the lower troposphere
was selected. The governing set of
equations is given by Eq. (1). An
oversimplified chemical mechanism is
used in this test calculation, i.e.,

$$
\begin{aligned}
\mathrm{NO} + \mathrm{O_3} &\rightarrow \mathrm{NO_2} + \mathrm{O_2} && (e) \\
\mathrm{NO_2} + h\nu &\xrightarrow{+\mathrm{O_2}} \mathrm{NO} + \mathrm{O_3} && (f) \\
\mathrm{NO_2} + \mathrm{OH} &\rightarrow \mathrm{HNO_3} && (g)
\end{aligned}
$$

(The OH concentrations are given by an
empirical formula and the O2
concentrations are assumed constant.)


Sample results for this test problem
calculated on a VAX 11/780 are shown in
Figures 1 and 2. Presented are the NOx
emissions for the eastern United
States, and the 24-hour averaged
predicted surface concentrations of NO,
NO2, and HNO3. The meteorological
conditions simulated are those of July
4, 1974. The grid system used in the
simulation was 32 x 32 x 16, and a
transport time step of 15 minutes and a
chemistry time step of 1 second were
used.

Figure 1. NOx emissions at the surface
on July 4, 1974.

Figure 2. Averaged concentrations at
the surface on July 4, 1974: (a) NOx,
(b) HNO3.

This combined transport/chemistry
problem is currently being run on the
MPP. There are two choices for mapping
this problem to the MPP. One method is
to perform the chemistry calculations
on the MPP and the transport part of
the calculation on the VAX. This
method is currently being tested. The
other method is to perform the entire
calculation on the MPP. We are
currently developing an algorithm to
solve the sets of tridiagonal matrices
on the MPP which arise from the
transport processes.
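
For reference, a serial sketch of the
standard Thomas algorithm for a single
tridiagonal system is given below. It
is the textbook sequential method, not
the MPP algorithm under development,
and the function and argument names are
assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve A x = rhs for a tridiagonal matrix A (Thomas algorithm).

    lower[i] is A[i][i-1] (lower[0] is unused), diag[i] is A[i][i],
    and upper[i] is A[i][i+1] (upper[-1] is unused). No pivoting is
    performed, so A should be diagonally dominant or otherwise safe
    to eliminate without row exchanges.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward elimination.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# A small made-up system whose exact solution is [1, 2, 3, 4].
lower = [0.0, -1.0, -1.0, -1.0]
diag = [4.0, 4.0, 4.0, 4.0]
upper = [-1.0, -1.0, -1.0, 0.0]
rhs = [2.0, 4.0, 6.0, 13.0]
x = solve_tridiagonal(lower, diag, upper, rhs)
```

In a locally one-dimensional scheme
each grid line in the sweep direction
yields one such system, and the systems
on different lines are independent of
one another.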

The above test calculation indicates 
the MPP is well suited for chemical 
network problems where each node can 
hold the entire mechanism. Current 
memory restrictions limit the size of 
the chemical mechanism that can be 
solved in this fashion. At present 
each processor can hold 32 32-bit 
variables. (It is planned to increase 
the storage in the near future.) 
However, it is possible to handle
larger chemical mechanisms. One way is
to group processors together. For
example, if 128 words are required at
each node, then four processors can work
together. This in turn would reduce 
the maximum number of grids possible by 
a factor of 4. Another way is to make 
use of the staging memory. 
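
The trade-off from grouping processors
can be sketched as simple arithmetic.
The example assumes that a group is
just the smallest whole number of
processors whose combined storage holds
one grid point, and it ignores array
geometry and inter-processor
communication; the names are
hypothetical.

```python
import math

N_PROCESSORS = 128 * 128      # 16,384 processors in the array
WORDS_PER_PROCESSOR = 32      # 32-bit variables held by each processor


def grid_capacity(words_per_grid_point):
    """Return (processors per grid point, maximum number of grid points)."""
    group = math.ceil(words_per_grid_point / WORDS_PER_PROCESSOR)
    return group, N_PROCESSORS // group


print(grid_capacity(32))     # (1, 16384)
print(grid_capacity(128))    # (4, 4096)
```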

SUMMARY 

The suitability of the MPP computer for
calculation of chemical network
problems is under evaluation. To date
the MPP has been used to calculate a
test problem which represents one
component of a sophisticated chemically
reactive flow problem. Specifically,
the set of coupled ODE-IVPs describing
the chemical reactions occurring at
16,384 spatial grid points was
calculated. This problem is ideally
suited for the MPP because, by using
operator splitting, the chemistry at
each grid point acts independently from
that of the other grid points (within
each transport time step). This test
calculation showed that the MPP can
perform a 100-time-step calculation 470
times faster than the same calculation
on the VAX 11/780. Also, since nearly
90% of the CPU time of large chemically
reactive flow problems is spent doing
the chemistry calculations, the MPP
architecture offers great potential for
CPU savings for model applications.
Coupled transport-chemistry problems
are now being tested on the MPP.